-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lints to ensure link text for EIPs should match the EIP's number #99
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I'd try to avoid hard coding regexes where possible. If they're hard coded, they can't be changed by the configuration file.
eipw-lint/src/lib.rs
Outdated
( | ||
"markdown-link-other", | ||
MarkdownLinkOther { | ||
pattern: markdown::LinkOther(r"^(EIP|ERC)-(\d+)\s*\S*$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be case insensitive ((?i)
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have the other lint to check for incorrectly cased references. So even if [eip-1](./eip-1.md)
passes markdown-link-other
, it'll fail the other lint.
Or at least I think it will 🤔
if self.link_depth > 0 { | ||
self.link_depth = self.link_depth.checked_sub(1).unwrap(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If self.link_depth
is greater than zero, the checked_sub
is unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checked_sub
in the depart_link
function returns self.link_depth to 0. I check link_depth in enter_text
fn. If link_depth == 0, it means the text is not from the link, so, I skip it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that if self.link_depth > 0
then self.link_depth - 1
can never underflow. The checked_sub
will never fail.
if self.link_depth > 0 { | ||
self.current_link.text = txt.to_owned(); | ||
self.check(ast)?; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this behave for content like:
[**EIP-1**5678](./eip-1.md#rationale)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It fails. I will add it to the tests. Should it fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, perhaps I spoke too soon then 🤣
I was worried that because every text node in the link sets self.current_link.text
, you could end up in situations where, for example, bold would create two text nodes, breaking the lint. Maybe something like [**EIP-1**EIP-1](./eip-1.md)
.
If my concerns are unfounded, ignore me!
| | ||
4 | [EIP-1](./eip-2.md) | ||
| | ||
= info: link text should match `[EIP|ERC-2]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you can, I'd rather the specific expected text in the message. Maybe something like:
error[markdown-link-eip]: link text does not match link destination
|
4 | [EIP-1](./eip-2.md)
|
= help: `use [EIP-2](./eip-2.md)` instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am working on the help hint. I check the link and build the correct version of the text based on the link. Is it correct? So, my hints will look like this:
[EIP-1](./eip-2.md) -> [EIP-2](./eip-2.md)
[EIP-1: Foo](./eip-1.md) -> [EIP-1](./eip-1.md)
[Another Proposal](./eip-1.md) -> [EIP-1](./eip-1.md)
[EIP-1](./eip-1.md#motivation) -> [EIP-1: Motivation](./eip-1.md#motivation)
[EIP-2: Hello](./eip-1.md#abstract) -> [EIP-1: Abstract](./eip-1.md#abstract)
[Another Proposal](./eip-1.md#rationale) -> [EIP-1: Rationale](./eip-1.md#rationale)
Example:
error[markdown-link-eip]: link text does not match link destination
|
4 | [Another Proposal](./eip-1.md#rationale)
|
= help: ` [EIP-1: Rationale](./eip-1.md#rationale)` instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks great!
| | ||
4 | [Another Proposal](./eip-1.md#rationale) | ||
| | ||
= info: link text should match `[EIP|ERC-1<section-description>]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar kind of comment here. The help text should (if possible) include the correct text to use. Authors might not understand how to make something "match".
Pushed updates. I tried to fix everything according to comments. Haven't found EIP-1 example in existing eips. So, for now ignoring you)) Can we create a new issue for it in case it comes later? Hardcoded regexes in lib.rs are kind of a style for the whole file, I followed the pattern. If we need to change regexes in config files, we need to refactor it globally. |
The configuration in Other regexes, like this one cannot be overridden. |
I've added a test in 09d7de7 that demonstrates the problem I brought up in #99 (comment). |
This one is dynamic. It depends on the lib.rs pattern. I extract the number of eip there and put into a new regex. Do you have any better ideas? |
fixed bold text |
@SamWilsn, I've done with the fixes. Could you please review it and merge if there are no any comments? |
fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> Result<String, Error> { | ||
if let Some(captures) = re.captures(text) { | ||
Ok(captures | ||
.get(index) | ||
.map(|m| m.as_str().to_string()) | ||
.unwrap_or_default()) | ||
} else { | ||
Ok(String::new()) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function never returns an error, so you can simplify it a bit:
fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> Result<String, Error> { | |
if let Some(captures) = re.captures(text) { | |
Ok(captures | |
.get(index) | |
.map(|m| m.as_str().to_string()) | |
.unwrap_or_default()) | |
} else { | |
Ok(String::new()) | |
} | |
} | |
fn extract_capture(&self, text: &str, re: &Regex, index: usize) -> String { | |
if let Some(captures) = re.captures(text) { | |
captures | |
.get(index) | |
.map(|m| m.as_str().to_string()) | |
.unwrap_or_default() | |
} else { | |
String::new() | |
} | |
} |
Ok(captures | ||
.get(index) | ||
.map(|m| m.as_str().to_string()) | ||
.unwrap_or_default()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that if the regular expression has the wrong number of capture groups, we should inform the user instead of silently returning the empty string.
You can ignore my previous comment about simplifying, and use Error::custom
and something like:
#[derive(Debug, Snafu)]
struct BadRegex;
let url_eip_text = self.extract_capture(&self.current_link.url, &self.re, 1)?; | ||
let url_eip_number = self.extract_capture(&self.current_link.url, &self.re, 2)?; | ||
let url_section = self.extract_capture(&self.current_link.url, &self.re, 4)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This repeats the regex search each time, which is pretty inefficient.
fn check(&self, ast: &Ast) -> Result<Next, Error> { | ||
let url_eip_text = self.extract_capture(&self.current_link.url, &self.re, 1)?; | ||
let url_eip_number = self.extract_capture(&self.current_link.url, &self.re, 2)?; | ||
let url_section = self.extract_capture(&self.current_link.url, &self.re, 4)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eipw-lint
already depends on url for parsing URLs. You could use it here to get the fragment (#...
).
let section_description = Visitor::transform_section_description(&url_section); | ||
format!( | ||
"[{}{}: {}]({})", | ||
url_eip_text.to_uppercase(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weirdly enough, ERCs are still stored in files named like eip-1234.md
, so you can't use the filename to predict whether you need "EIP-..." or "ERC-..."
The correct way to solve this (reading the linked file) won't work for reasons outside eipw
's scope, so... I guess the best we can do is something like:
help: use
[EIP-1237](./eip-1237.md)
or[ERC-1237](./eip-1237.md)
instead
@SamWilsn I updated Rust (1.80.1 the latest stable version) and now have this issue with |
If you're using rustup, you can use, for example, I'll update the rust version in a separate PR. |
Issue 67. Covered examples with tests.